Abstract

An old problem raised independently by Jacobson and Schönheim asks to determine the
maximum $s$ for which every graph with $m$ edges contains a pair of edge-disjoint isomorphic
subgraphs with $s$ edges. In this paper we determine this maximum up to a constant factor.
We show that every $m$-edge graph contains a pair of edge-disjoint isomorphic subgraphs with
at least $c(m \log m)^{2/3}$ edges for some absolute constant $c$, and find graphs where this estimate
is off only by a multiplicative constant. Our results improve bounds of Erdős, Pach, and Pyber
from 1987.

1 Introduction 

The decomposition of a given graph into smaller subgraphs is an old problem in graph theory that
has been studied from numerous perspectives. A celebrated result of Wilson [16] asserts that given
any fixed graph $H$, the edge set of any sufficiently large complete graph $K_n$ can be partitioned into
edge-disjoint copies of $H$, as long as the obvious necessary divisibility conditions $e(H) \mid \binom{n}{2}$ and
$g \mid n - 1$ (where $g$ is the greatest common divisor of the degrees of $H$) are satisfied.

A factor of a graph is a spanning subgraph, and a factorization is a partition of its edges 
into factors. A series of papers by Graham, Harary, Robinson, Wallis, and Wormald (see, e.g., 
[7], [9], [10], [11], [15]) introduced the systematic study of isomorphic factorizations, in which the resulting
factors are required to be isomorphic to each other as graphs. In this literature, a graph G is said 
to be divisible by an integer t, or t-divisible, if G admits an isomorphic factorization into t parts, 
although the analogy with the number-theoretic notion of divisibility is only syntactical. The notion 
of 2-divisibility has also been termed bisectable, with some authors tagging on the extra condition 
that the resulting factors were also connected graphs. 

The earliest work concerned the divisibility of the complete graph. Extending a partial result of
Guidotti [8], Harary, Robinson, and Wormald [10] proved that the complete graph $K_n$ is divisible by
any integer $t$ which satisfies the obvious necessary condition $t \mid \binom{n}{2}$. Most other existing research
on divisibility concentrates on trees and forests, perhaps because their simple structure appears
more tractable. Algorithmically, Graham and Robinson proved in [7] that it is NP-hard to decide
whether a tree is 2-divisible, while Harary and Robinson [9] discovered a polynomial-time algorithm
to decide whether a tree admits an isomorphic factorization into two connected graphs. The best
general result on trees is due to Alon, Caro, and Krasikov [1], who showed that every $m$-edge tree
can be made 2-divisible by deleting only $O(m / \log \log m)$ edges.

Once one considers general graphs, however, it becomes essentially impossible to hope for
2-divisibility or even closeness to 2-divisibility. It is therefore natural to ask what is the largest
2-divisible subgraph which must exist in a given graph. This problem (stated below in generality
for hypergraphs) was originally raised independently by Jacobson and Schönheim.

Problem 1.1. Let the self-similarity of an $r$-uniform hypergraph $G$, denoted $\iota(G)$, be the largest
integer $s$ for which $G$ contains a pair of edge-disjoint isomorphic sub-hypergraphs with $s$ edges each.
For each positive integer $m$, let $\iota_r(m)$ be the minimum of $\iota(G)$ over all $r$-uniform hypergraphs with
$m$ edges. Determine $\iota_r(m)$.

Remark. This paper focuses on graphs ($r = 2$), so we will write $\iota(m)$ instead of $\iota_2(m)$ throughout.
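
To make the definition of self-similarity concrete for graphs, here is a minimal brute-force sketch. It is only an illustration and not part of the paper's argument: the function name `self_similarity` and the vertex labelling $0, \ldots, n-1$ are hypothetical choices, and the search is exponential, so it is usable only for graphs with a handful of vertices. It relies on the fact that two edge subsets span isomorphic subgraphs exactly when some permutation of the vertex set maps one onto the other.

```python
from itertools import combinations, permutations


def self_similarity(n, edges):
    """Largest s such that the graph on vertices 0..n-1 contains two
    edge-disjoint isomorphic subgraphs with s edges each (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    edge_list = list(edge_set)
    perms = list(permutations(range(n)))
    # Try the largest candidate value first; s can never exceed m // 2.
    for s in range(len(edge_list) // 2, 0, -1):
        for H in combinations(edge_list, s):
            rest = edge_set - set(H)
            for pi in perms:
                # Image of H under the vertex permutation pi.
                image = {frozenset(pi[v] for v in e) for e in H}
                # pi(H) must lie in the graph and avoid the edges of H.
                if image <= rest:
                    return s
    return 0


# Example: the 4-cycle splits into two edge-disjoint paths with two edges each.
print(self_similarity(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```

On the complete graph $K_4$ this search returns 3, matching the fact that $K_4$ decomposes into two paths with three edges each.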

The first main result in this area was due to Erdős, Pach, and Pyber [1]. Specifically, they
proved that there were absolute constants $c_r$ and $C_r$ for which

$$c_r m^{2/(2r-1)} \le \iota_r(m) \le C_r m^{2/(r+1)} \cdot \frac{\log m}{\log \log m}.$$

Their upper bound construction is based on an appropriately chosen random $r$-uniform hypergraph.
For graphs ($r = 2$), the powers of $m$ coincide at $m^{2/3}$, so their lower bound deviated only by a
logarithmic factor from their upper bound construction, which was essentially the Erdős-Rényi
random graph. At around the same time, similar results were obtained independently by Alon
and Krasikov (unpublished), and by Gould and Rödl. The latter group determined in [6] that for
3-uniform hypergraphs, $\iota_3(m) \ge c\sqrt{m}$ for a constant $c > 0$, which matched the upper bound exponent,
but again fell short by a logarithmic factor. Very recently, Horn, Koubek, and Rödl [13] announced lower bounds
for $\iota_4(m)$, $\iota_5(m)$, and $\iota_6(m)$ which also came within poly-logarithmic factors of the corresponding
upper bounds derived from random hypergraphs.

The main result of our paper completely solves the graph case, determining the asymptotic rate
of growth of $\iota(m) = \iota_2(m)$.

Theorem 1.2. There are absolute constants $c$ and $C$ for which

$$c(m \log m)^{2/3} \le \iota(m) \le C(m \log m)^{2/3}.$$

The key idea is to exploit rare large deviations events through a constructive algorithm, rather 
than to attempt to erase them with union bounds. Incidentally, our upper bound construction is 
still based on a random graph, but with a slightly modified edge probability. 

Inspired by the asymptotic optimality of random graphs in the problem of Jacobson and
Schönheim, our next result explicitly studies the self-similarity of random graphs. The Erdős-Rényi
random graph $G_{n,p}$ is constructed on the vertex set $[n] = \{1, \ldots, n\}$ by taking each potential edge
independently with probability $p$. We say that $G_{n,p}$ possesses a graph property $\mathcal{P}$ asymptotically
almost surely, or a.a.s. for brevity, if the probability that $G_{n,p}$ possesses $\mathcal{P}$ tends to 1 as $n$ grows
to infinity. Since its first appearance in the 1960s, this beautiful object has been a central topic of
study in graph theory. Surprisingly, many problems about random graphs arose from research in
various other areas of mathematics and theoretical computer science. Yet despite the great amount
of work devoted to this topic over the past fifty years, many interesting questions
remain unresolved. For more on random graphs, we refer the reader to the books [5], [14].

When $p < \frac{1}{n}$, it is well known that a.a.s. all connected components of $G_{n,p}$ are either trees
or unicyclic (that is, trees with a single additional edge). Applying the previously mentioned result
of Alon, Caro, and Krasikov, or even Proposition 2.3 below, it is then easy to see that the
self-similarity of $G_{n,p}$ in that regime is $\Theta(m)$ a.a.s., where $m$ is the number of edges. Our second result
asymptotically determines $\iota(G_{n,p})$ for the remaining range of $p$.

Theorem 1.3.

(i) If $\frac{1}{n} \le p(n) \le \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, then $\iota(G_{n,p}) = \Theta\left(n \cdot \frac{\log n}{\log \gamma}\right)$ a.a.s., where $\gamma(n) = \frac{1}{p}\sqrt{\frac{\log n}{n}}$.

(ii) If $p(n) \ge \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, then $\iota(G_{n,p}) = \Theta(n^2 p^2)$ a.a.s.

We will prove this theorem in the next section. Its proof illustrates the main ideas of the
argument for Theorem 1.2, which follows in Section 3.

Notation. Let $G$ be a graph with vertex set $V$. For a subset of vertices $X \subset V$, let $G[X]$ be the
subgraph of $G$ induced by $X$. For a vertex $v \in V$, we use $N(v)$ to denote the set of neighbors of
$v$. Given a bijection $f : V \to V'$, let $f(G)$ be the graph with vertex set $V'$, where $x', y' \in V'$ are
adjacent if and only if there exist two adjacent vertices $x, y \in V$ such that $f(x) = x'$ and $f(y) = y'$.
For two graphs $G_1$ and $G_2$ defined on the same vertex set, let $G_1 \cup G_2$ be the graph obtained by
taking the union of the edge sets of the two graphs, and let $G_1 \cap G_2$ be the graph obtained by
taking the intersection of the edge sets of the two graphs.

The following standard asymptotic notation will be utilized extensively. For two functions $f(n)$
and $g(n)$, we write $f(n) = o(g(n))$ if $\lim_{n \to \infty} f(n)/g(n) = 0$, and $f(n) = O(g(n))$ or $g(n) = \Omega(f(n))$
if there exists a constant $M$ such that $|f(n)| \le M|g(n)|$ for all sufficiently large $n$. We also write
$f(n) = \Theta(g(n))$ if both $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ are satisfied. All logarithms will be in
base $e \approx 2.718$.

2 Random graphs 

We will use the following well-known concentration result, which is a consequence of Theorems
A.1.11 and A.1.13 in the book [2]. Let $\mathrm{Bin}(n, p)$ denote the binomial random variable with
parameters $n$ and $p$.

Theorem 2.1. If $X \sim \mathrm{Bin}(n, p)$ and $\lambda \le np$, then

$$\mathbb{P}\left[|X - np| \ge \lambda\right] \le e^{-\Omega(\lambda^2 / (np))}.$$

We begin by analyzing the self-similarity of random graphs. In addition to being an interesting
question in its own right, this investigation also suggests good intuition for general graphs. The
upper bounds on $\iota(G_{n,p})$ follow from relatively straightforward union bounds.

Proof of upper bound in Theorem 1.3. Suppose that we are seeking a pair of edge-disjoint isomorphic
subgraphs with $t$ edges. This task is equivalent to finding subgraphs $H'$ with $2t$ edges that
can be partitioned into the union $H \cup \pi(H)$, for some $t$-edge subgraph $H$ and a permutation $\pi$ of
the vertex set. The expected number of such subgraphs $H'$ in $G_{n,p}$ is at most

$$\binom{\binom{n}{2}}{t} \cdot n! \cdot p^{2t} \le \left(\frac{e n^2 p^2}{2t}\right)^{t} e^{n \log n}, \qquad (1)$$

where the first binomial coefficient counts the number of ways to select $t$ edges for $H$ out of all
$\binom{n}{2}$ available, and the $n!$ bounds the number of permutations $\pi$ of the vertex set. Together, these
choices determine the $2t$ edges which make up $H'$, which appear with probability $p^{2t}$. Thus, if we
select a value of $t$ for which the right hand side of (1) becomes $o(1)$, we will establish that the
number of such $H'$ is zero a.a.s., and hence $\iota(G_{n,p}) < t$ a.a.s.

We separately specify suitable choices for $t$ for the two regimes of $p$ that we consider in this
theorem. For part (i), where $\frac{1}{n} \le p \le \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, we use $t = \frac{n \log n}{\log \gamma}$, where $\gamma = \frac{1}{p}\sqrt{\frac{\log n}{n}}$. Note that
in this range we have $e^6 \le \gamma \le 2\sqrt{n \log n}$. Since $n^2 p^2 = \frac{n \log n}{\gamma^2}$, the right hand side of (1) becomes

$$\left(\frac{e \log \gamma}{2 \gamma^2}\right)^{\frac{n \log n}{\log \gamma}} e^{n \log n} = e^{-\frac{n \log n}{\log \gamma} \cdot \log\left(\frac{2\gamma^2}{e \log \gamma}\right)} \cdot e^{n \log n}.$$

Since $\gamma \ge e^6$, we have $\log\left(\frac{2\gamma^2}{e \log \gamma}\right) \ge \frac{3}{2} \log \gamma$, and hence the right hand side of (1) is at most

$$e^{-\frac{3}{2} n \log n} \cdot e^{n \log n} = o(1).$$

For part (ii), where $p \ge \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, we specify $t = e^{12} n^2 p^2$, so that $t \ge n \log n$. The right hand side of (1) then
becomes

$$\left(\frac{e}{2e^{12}}\right)^{e^{12} n^2 p^2} e^{n \log n} \le \left(\frac{e}{2e^{12}}\right)^{n \log n} e^{n \log n} = o(1).$$

□
The remainder of this section is devoted to constructing large self-similar subgraphs in $G_{n,p}$.
The structure given in the following definition turns out to be extremely useful (both for this section
and the next section).

Definition 2.2. Let $d$ and $k$ be positive integers.

(i) A $d$-star is a graph consisting of $d + 1$ vertices and $d$ edges, where one of the vertices has
degree $d$. We sometimes simply refer to these graphs as stars.
(ii) A $(d, k)$-star-forest is a collection of $k$ vertex-disjoint $d$-stars. We denote a $(d, k)$-star-forest
by the set of pairs $\{(v, N_v) : v \in B\}$, where $B$ is a set of $k$ vertices, and for each $v$, the set
$N_v \subset N(v)$ is a set of $d$ neighbors of $v$, with the sets $N_v$ pairwise disjoint.
The following two propositions were the key ideas in [4]. We include their proofs for completeness,
as well as to illuminate the points at which we introduce our new arguments. The first claim
asserts that the self-similarity of a graph is large if there are many non-isolated vertices.

Proposition 2.3. Let $G$ be a graph on $n$ vertices with no isolated vertices. Then $\iota(G) \ge \frac{n-2}{4}$.

Proof. We first prove that G contains vertex-disjoint stars that cover all the vertices of the graph. 
Given a graph G, iteratively remove edges that connect two vertices of degree at least two (in an 
arbitrary order). Clearly, this process never creates isolated vertices, and the final graph consists 
only of stars because all remaining vertices of degree two or more are non-adjacent. 

It remains to show that any $n$-vertex star forest contains two large edge-disjoint isomorphic
subgraphs $G_1$ and $G_2$. We consider the stars in the forest by their type. Note that 1-stars are
nothing more than single edges, so for every two 1-stars, we can put one of them in $G_1$ and the
other in $G_2$. We account for this as a contribution of $+1$ toward $\iota(G)$ from the four vertices in the
two 1-stars. On the other hand, for $d \ge 2$, we can split the edges of every $d$-star into two sets of size
$\lfloor d/2 \rfloor$, possibly with one edge left over. By adding one part to $G_1$ and the other to $G_2$, we see that
the $d + 1$ vertices of each $d$-star contribute $+\lfloor d/2 \rfloor$ to $\iota(G)$. Accumulating the contributions from all
vertices, except possibly for at most two vertices from a single unpaired 1-star, we find that

$$\iota(G) \ge (n - 2) \cdot \min\left\{\frac{1}{4}, \min_{d \ge 2} \frac{\lfloor d/2 \rfloor}{d + 1}\right\} = \frac{n - 2}{4}.$$

□
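
The two steps of this proof (reduce to a spanning star forest, then split the stars) can be sketched in code. This is a minimal illustration, assuming a graph without isolated vertices on the vertex set $0, \ldots, n-1$; the function names `star_cover` and `star_forest_lower_bound` are hypothetical and do not come from the paper.

```python
def star_cover(n, edges):
    """Repeatedly delete edges whose endpoints both have degree >= 2.
    For a graph with no isolated vertices, the result is a spanning star forest,
    returned as an adjacency dictionary."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for u in range(n):
            for v in list(adj[u]):
                # Deleting such an edge never isolates u or v.
                if len(adj[u]) >= 2 and len(adj[v]) >= 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return adj


def star_forest_lower_bound(adj):
    """Number of edges placed in each of the two isomorphic subgraphs:
    floor(d/2) per d-star with d >= 2, plus one per pair of single edges."""
    from_big_stars = 0
    single_edges = 0
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d >= 2:
            from_big_stars += d // 2
        elif d == 1:
            (u,) = nbrs
            # Count each isolated edge once, from its smaller endpoint.
            if len(adj[u]) == 1 and v < u:
                single_edges += 1
    return from_big_stars + single_edges // 2


# Example: the path 0-1-2-3 reduces to two single edges, giving a bound of 1.
print(star_forest_lower_bound(star_cover(4, [(0, 1), (1, 2), (2, 3)])))
```

The value returned is a lower bound on $\iota(G)$ and, by the argument above, is always at least $(n-2)/4$.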



Although our problem considers the self-similarity within a single graph, our lower bound argu- 
ment first separates the given graph into two disjoint subgraphs, and constructs a suitable mapping 
between them which overlaps many edges. 

Definition 2.4. Let $G_1$ and $G_2$ be two edge-disjoint graphs, on possibly overlapping vertex sets $V_1$
and $V_2$ of the same cardinality. Let their similarity $\iota(G_1, G_2)$ be the maximum integer $s$ such that
there exists a bijection $f : V_1 \to V_2$ for which $f(G_1) \cap G_2$ contains $s$ edges.

The next proposition uses a random mapping as the input in Definition 2.4 in order to measure
similarity of two random bipartite graphs.

Proposition 2.5. For $i = 1, 2$, let $G_i$ be edge-disjoint bipartite graphs with parts $A_i$ and $B_i$, where
$|A_1| = |A_2| = n_1$ and $|B_1| = |B_2| = n_2$. Suppose that $A_1 \cup A_2$ and $B_1 \cup B_2$ are disjoint, but $A_1$
may intersect $A_2$ and $B_1$ may intersect $B_2$. Then $\iota(G_1, G_2) \ge \frac{|E(G_1)| \cdot |E(G_2)|}{n_1 n_2}$.

Proof. Independently sample uniformly random bijections from $A_1$ to $A_2$ and from $B_1$ to $B_2$, and
let $f$ be their combination. For each pair of edges $e_1 \in E(G_1)$ and $e_2 \in E(G_2)$, the probability that
$e_1$ gets mapped to $e_2$ by $f$ is exactly $\frac{1}{n_1 n_2}$. Such a situation contributes $+1$ to the intersection size
$f(G_1) \cap G_2$. Therefore, by linearity of expectation, the expected number of edges in $f(G_1) \cap G_2$ is
at least $\frac{|E(G_1)| \cdot |E(G_2)|}{n_1 n_2}$, and there exists a suitable $f$ which achieves that bound. □
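
The averaging argument can be checked exhaustively on a toy instance. The sketch below is an assumption-laden illustration rather than the paper's construction: it labels both $A_1, A_2$ as $0, \ldots, n_1 - 1$ and both $B_1, B_2$ as $0, \ldots, n_2 - 1$ (the proposition allows such overlap), represents each edge as a pair (vertex in $A$, vertex in $B$), and uses hypothetical function names. It enumerates every pair of bijections instead of sampling.

```python
from fractions import Fraction
from itertools import permutations


def overlap(E1, E2, fa, fb):
    """Number of edges of G1 that the pair of bijections (fa, fb) maps onto edges of G2."""
    E2 = set(E2)
    return sum((fa[a], fb[b]) in E2 for a, b in E1)


def average_and_best_overlap(E1, E2, n1, n2):
    """Exact average and maximum of the overlap over all pairs of bijections."""
    total = 0
    best = 0
    count = 0
    for fa in permutations(range(n1)):
        for fb in permutations(range(n2)):
            k = overlap(E1, E2, fa, fb)
            total += k
            best = max(best, k)
            count += 1
    return Fraction(total, count), best


# Toy instance with n1 = n2 = 3 and edge-disjoint edge sets.
E1 = [(0, 0), (1, 1), (2, 2)]
E2 = [(0, 1), (1, 2), (2, 0), (0, 2)]
average, best = average_and_best_overlap(E1, E2, 3, 3)
print(average, best)
```

The exact average equals $|E(G_1)| \cdot |E(G_2)| / (n_1 n_2)$, and the best bijection is at least as good as the average, which is all the proposition uses.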

Corollary 2.6. Let $G$ be a bipartite graph with parts $A$ and $B$ such that $|E(G)| \ge 10$. Then
$\iota(G) \ge \frac{|E(G)|^2}{5|A||B|}$.

Proof. Arbitrarily partition $G$ into two edge-disjoint subgraphs $G_1 \cup G_2$ with $|E(G_i)| \ge \frac{|E(G)| - 1}{2} \ge
\frac{9|E(G)|}{20}$ edges each, and apply Proposition 2.5. □

Corollary 2.7. Let $G$ be a graph with $n$ vertices and $m$ edges, where $m \ge 20$. Then $\iota(G) \ge \frac{m^2}{5n^2}$.

Proof. Let $A \cup B$ be a bipartition of the vertex set of $G$ chosen uniformly at random. The probability
of a single edge intersecting both parts is exactly $\frac{1}{2}$, and thus by averaging, there exists a bipartition
$A \cup B$ for which the bipartite graph $H$ between $A$ and $B$ contains at least $\frac{m}{2}$ edges. Since $|A||B| \le \frac{n^2}{4}$
and $m/2 \ge 10$, by Corollary 2.6 we have $\iota(G) \ge \iota(H) \ge \frac{(m/2)^2}{5n^2/4} = \frac{m^2}{5n^2}$. □

To prove Proposition 2.5 we considered a random bijection between the two vertex sets, as
there exists a map such that the resulting number of overlapping edges is at least its expectation.
This strategy turns out to be strong enough when the graph is dense. On the other hand, for sparse
graphs, Proposition 2.3 produces a reasonable bound. These were the key steps used by Erdős,
Pach, and Pyber in [3]. In order to establish Theorem 1.3, however, we need something slightly
more powerful for the intermediate edge density regime.

The key new ingredient is to design a vertex permutation that performs better than a uniformly
random one. To sketch our argument, consider the illustrative case $p = n^{-1/2}$, which represents
the most delicate range. We first randomly split the vertices into four parts $A_1, A_2, B_1, B_2$ of equal
size, and let $G_i$ be the bipartite graph formed by the edges between $A_i$ and $B_i$. We discard all other
edges, and bound only the similarity between $G_1$ and $G_2$. Rather than searching for an unstructured
permutation of the whole vertex set, we build a favorable bijection $f : A_1 \cup B_1 \to A_2 \cup B_2$ which sends
$A_1$ to $A_2$ and $B_1$ to $B_2$ with many overlapping edges. Note that if we let $f$ be a uniformly random
bijection from $A_1 \cup B_1$ to $A_2 \cup B_2$, then we essentially recover Proposition 2.5, thus producing a
lower bound of order only $\Theta(n)$, which falls short of Theorem 1.3 by a logarithmic factor.

We start with a uniformly random bijection from $B_1$ to $B_2$, and carefully extend it from $A_1$ to
$A_2$ as follows. Consider a fixed vertex $v_1$ in $A_1$ and a fixed vertex $v_2$ in $A_2$. If we mapped $v_1$ to
$v_2$, we would increase the number of overlapping edges by exactly $|f(N(v_1)) \cap N(v_2)|$, where $N(v_i)$
represents the set of neighbors of $v_i$ in $B_i$. (Recall that we discarded all other edges, so the $v_i$ only
have neighbors in their corresponding $B_i$.) Since we have $p = n^{-1/2}$, if $v_2$ is chosen uniformly at
random, the expected size of the set $f(N(v_1)) \cap N(v_2)$ is some constant $\lambda$, and this observation led
to the $\Theta(n)$ lower bound when considering a uniformly random bijection.

The crucial observation is that for each individual pair of $v_i$, the overlap $|f(N(v_1)) \cap N(v_2)|$
asymptotically has the Poisson distribution with mean $\lambda$. Therefore, with probability at least $n^{-\epsilon}$,
it will be of size at least $\epsilon' \frac{\log n}{\log \log n}$ for some small constants $\epsilon$ and $\epsilon'$. Since $A_2$ has $\frac{n}{4}$ vertices,
the expected number of vertices $v_2 \in A_2$ that will give this high gain together with $v_1$ is $\Omega(n^{1-\epsilon})$.
In particular, it is very likely that there exists a suitable vertex $v_2$ for $v_1$ such that
$|f(N(v_1)) \cap N(v_2)| \ge \epsilon' \frac{\log n}{\log \log n}$, and we will map $v_1$ to $v_2$ in such a situation. By repeating this for a constant
proportion of vertices in $A_1$, we will obtain $\iota(G_{n,p}) \ge \Omega\left(n \cdot \frac{\log n}{\log \log n}\right)$. Since $\gamma = \sqrt{\log n}$, this gives
$\iota(G_{n,p}) \ge \Omega\left(n \cdot \frac{\log n}{\log \gamma}\right)$ for our choice of $p$. Our next two lemmas formalize this intuition.
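
The Poisson tail heuristic in this sketch can be probed numerically. The snippet below is only a sanity check under assumed parameter values, none of which are taken from the paper: $n = 10^6$, mean overlap $\lambda = 1/4$ (what parts of size $n/4$ and $p = n^{-1/2}$ would suggest), $\epsilon' = 1/2$ and $\epsilon = 1/2$. The function names are hypothetical.

```python
import math


def poisson_point_mass(lam, k):
    """P[Poisson(lam) = k], which is a lower bound on P[Poisson(lam) >= k]."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def target_overlap(n, eps_prime):
    """The overlap size eps' * log n / log log n from the sketch, rounded up."""
    return math.ceil(eps_prime * math.log(n) / math.log(math.log(n)))


n = 10 ** 6
lam = 0.25  # assumed mean overlap for one pair (v1, v2)
k = target_overlap(n, eps_prime=0.5)
tail = poisson_point_mass(lam, k)

# Expected number of partners v2 in A2 (of size n/4) reaching overlap k with a fixed v1.
expected_partners = (n / 4) * tail
print(k, tail, expected_partners)
```

For these values the single-pair probability is small but still polynomially large in $1/n$, so among the $n/4$ candidates for $v_2$ many are expected to reach the target overlap.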
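Lemma 2.8. Let $n$ and $p$ satisfy $n^{-21/40} \le p \le \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, and define $\gamma = \frac{1}{p}\sqrt{\frac{\log n}{n}}$. Let $N_1, \ldots, N_s \subset
B$ be $s \ge n^{1/3}$ disjoint sets of size $\frac{np}{16}$, and consider the random set $B_p$, where we take each element
of $B$ independently with probability $p$. Then with probability at least $1 - e^{-\Omega(n^{1/12})}$, there is an index
$i$ such that $|B_p \cap N_i| \ge \frac{\log n}{20 \log \gamma}$.

Proof. Let $t = \left\lceil \frac{\log n}{20 \log \gamma} \right\rceil$. In our range of $p$, we always have $2 \le t \le \left\lceil \frac{\log n}{120} \right\rceil$, so in particular
$t \le \frac{\log n}{10 \log \gamma}$. For a fixed index $i$, the probability that $|B_p \cap N_i| \ge \frac{\log n}{20 \log \gamma}$ is at least $\binom{np/16}{t} p^t (1 - p)^{np/16}$.
Using the bounds $\binom{n}{k} \ge \left(\frac{n}{k}\right)^k$ and $(1 - p) \ge e^{-\frac{16}{15} p}$ for small $p$, we have

$$\binom{np/16}{t} p^t (1 - p)^{np/16} \ge \left(\frac{np^2}{16t}\right)^t e^{-np^2/15}
\ge \left(\frac{10 \log \gamma}{16 \gamma^2}\right)^{\frac{\log n}{10 \log \gamma}} \cdot n^{-1/(15 e^{12})}
= e^{-\frac{\log n}{10 \log \gamma} \cdot \log\left(\frac{16 \gamma^2}{10 \log \gamma}\right)} \cdot n^{-1/(15 e^{12})},$$

which by $\log\left(\frac{16\gamma^2}{10 \log \gamma}\right) \le 2 \log \gamma$ (deduced from $\gamma \ge e^6$), is at least

$$e^{-\frac{\log n}{5}} \cdot n^{-1/(15 e^{12})} \ge n^{-1/4}.$$

Hence the expected number of indices $i$ such that $|B_p \cap N_i| \ge \frac{\log n}{20 \log \gamma}$ is at least $s \cdot n^{-1/4} \ge n^{1/12}$.
Since the sets $N_i$ are disjoint, the above events for different choices of $i$ are mutually independent.
Therefore, by Chernoff's inequality, with probability at least $1 - e^{-\Omega(n^{1/12})}$, we can find an index $i$
(indeed, several) for which $|B_p \cap N_i| \ge \frac{\log n}{20 \log \gamma}$. □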

The previous estimate enables us to bound the similarity between random bipartite graphs.

Lemma 2.9. Let $n$ and $p$ satisfy $n^{-21/40} \le p \le \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, and let $\gamma = \frac{1}{p}\sqrt{\frac{\log n}{n}}$. Let $A_1, B_1, A_2, B_2$ be
disjoint sets of size $\frac{n}{4}$ each, and for each $i = 1, 2$, let $G_i$ be a random bipartite graph with parts $A_i$
and $B_i$, where each edge appears independently with probability $p$. Then $\iota(G_1, G_2) \ge \frac{n \log n}{160 \log \gamma}$ a.a.s.

Proof. Start with a uniformly random bijection $f$ from $B_1$ to $B_2$, and also expose all edges in the
random bipartite graph $G_2$. Since $p \ge n^{-21/40}$, Chernoff's inequality and a union bound establish
that a.a.s., all degrees in $G_2$ are between $\frac{np}{8}$ and $np$. Condition on this event. We expose the
edges in the bipartite graph $G_1$ by iterating over the vertices in $A_1$, exposing each vertex's incident
edges in turn. Consider the following greedy algorithm for finding a bijection between $A_1$ and $A_2$.
Let $A_1'$ be the set of vertices in $A_1$ whose edges have been exposed, and suppose that we have an
injective map $f : A_1' \to A_2$ such that for all $x \in A_1'$, $f(N(x))$ and $N(f(x))$ intersect in at least
$\frac{\log n}{20 \log \gamma}$ vertices. Let $A_2' = f(A_1')$, and let $A_i'' = A_i \setminus A_i'$ for $i = 1, 2$. Suppose that $|A_1''| = |A_2''| \ge \frac{n}{8}$
at some point of the process.

We first prove that the graph $G_2[A_2'' \cup B_2]$ contains a $(\frac{np}{16}, n^{1/3})$-star-forest. Indeed, let $k$ be the
largest integer such that there exists a $(\frac{np}{16}, k)$-star-forest $\{(x, N_x) : x \in X\}$ for some set $X \subset A_2''$ of
size $|X| = k$, and suppose that $k < n^{1/3}$. Let $N(X)$ be the union of all neighborhoods of vertices
in $X$. We know that for every vertex $w \in A_2'' \setminus X$, we have $|N(w) \cap N(X)| \ge \left(\frac{1}{8} - \frac{1}{16}\right) np = \frac{np}{16}$, as
otherwise we find a $(\frac{np}{16}, k + 1)$-star-forest, contradicting maximality. Therefore, there are at least
$\frac{np}{16} \cdot (|A_2''| - |X|) = \Omega(n^2 p)$ edges between the sets $A_2'' \setminus X$ and $N(X)$, and in particular, the set $N(X)$
has $\Omega(n^2 p)$ incident edges in $G_2$. Note that $|N(X)| \le knp \le n^{4/3} p$, since we conditioned on
all degrees in $G_2$ being at most $np$, and by the same reason, the number of edges incident to $N(X)$
must be at most $n^{7/3} p^2 = o(n^2 p)$, contradiction. Therefore, we have $k \ge n^{1/3}$, as claimed.

Now take any vertex $v_1 \in A_1''$, and expose its edges to $B_1$. Its neighborhood $N(v_1)$ is a
random subset of $B_1$, where each vertex of $B_1$ appears independently with probability $p$. Since the
bijection $f : B_1 \to B_2$ was fixed from the outset, the image of the neighborhood $f(N(v_1))$ is also a
random subset of $B_2$ with the same product distribution. By Lemma 2.8, with probability at least
$1 - e^{-\Omega(n^{1/12})}$, we can find a vertex $v_2 \in X \subset A_2''$ such that $|f(N(v_1)) \cap N_{v_2}| \ge \frac{\log n}{20 \log \gamma}$, where $X$
and $N_{v_2}$ were from the star forest constructed above. Define $f(v_1) = v_2$ and repeat the procedure.
Since the probability of success at each round is $1 - o(n^{-1})$, we can successfully iterate $\frac{n}{8}$ times
a.a.s., and then finish by extending $f$ by an arbitrary bijection between the non-mapped vertices
of $A_1$ and $A_2$. In this way, we obtain a bijection $f$ such that the number of edges in $f(G_1) \cap G_2$ is
at least $\frac{n}{8} \cdot \frac{\log n}{20 \log \gamma} = \frac{n \log n}{160 \log \gamma}$, as desired. □

We are now ready to prove the lower bounds of Theorem 1.3.

Proof of lower bound in Theorem 1.3. Part (i) has two subcases. First, for $\frac{1}{n} \le p \le n^{-21/40}$, note
that $\gamma = \frac{1}{p}\sqrt{\frac{\log n}{n}} \ge n^{1/40}\sqrt{\log n}$, so the desired lower bound is of order $n \cdot \frac{\log n}{\log \gamma} = \Theta(n)$. In this
range, the number of non-isolated vertices is $\Theta(n)$ a.a.s., so Proposition 2.3 completes this case.
For the next range $n^{-21/40} \le p \le \frac{1}{e^6}\sqrt{\frac{\log n}{n}}$, we apply Lemma 2.9 after splitting the vertex set into
four parts. Part (ii) follows directly from Corollary 2.6. □



3 Self-similarity of general graphs 

Although general graphs are not intrinsically random, we apply probabilistic techniques to find large
edge-disjoint isomorphic subgraphs. The outline of our proof for general graphs is similar to that
for random graphs (see the discussion following Corollary 2.7 in the previous section). The key idea
is to exploit tail events in the Poisson distribution. However, establishing this was somewhat easier
for random graphs since we had independence, and could expose edges in a controlled manner. For
general graphs, there are no random edges to expose. Instead, we turn to star forests, which were
also an important component in the proof of Lemma 2.9.

Let $G$ be a given graph on $n$ vertices with average degree $d$. As before, we begin by randomly
splitting the vertices into four parts $A_1, A_2, B_1, B_2$, and consider the bipartite graphs $G_i$ formed
by the edges between $A_i$ and $B_i$. We attempt to find a total of $\Omega(n^{1-\alpha})$ many $(\frac{d}{10}, n^\alpha)$-star-forests
$S_{i,j} = \{(v, N_v) : v \in X_{i,j}\}$ for $i = 1, 2$, $1 \le j \le \Theta(n^{1-\alpha})$, where the sets $X_{i,j} \subset A_i$ are disjoint
for different indices. Note that $\bigcup_j X_{i,j}$ then cover a constant fraction of each $A_i$, and hence the
edges in these star forests constitute a constant fraction of the edges in the entire graph $G$. If we
fail to find such star forests, then we will be able to pass to a subgraph where we can find even
larger isomorphic subgraphs. On the other hand, once we find such star forests, we take a random
bijection $f_B$ from $B_1$ to $B_2$, and extend it by independent bijections from $X_{1,j}$ to $X_{2,j}$. To this
end, we declare $f_B$ to be good for the index $j$ if it can be extended to a bijection between the sets
$B_1 \cup X_{1,j}$ and $B_2 \cup X_{2,j}$ so that the two star-forests overlap in $\Omega\left(|X_{1,j}| \cdot \frac{\log n}{\log(n \log n / d^2)}\right)$ edges under
the map. If some bijection $f_B$ happens to be good for a constant proportion of indices $j$, then we
can extend the bijection $f_B$ to the sets $X_{1,j}$ for these indices, and thereby construct a map $f$ that
overlaps many edges of $G_1$ and $G_2$.

To begin this program, our first lemma establishes the tail probability of the main random
variable in our setting. It is the analogue of Lemma 2.8.

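Lemma 3.1. Let $\alpha < \frac{1}{2}$ be a fixed positive real number, and let $d$ and $n$ satisfy $n^{\frac{1}{2} - \frac{\alpha}{16}} \le d \le
\sqrt{\alpha n \log n}$. Let $N_1, \ldots, N_s \subset [n]$ be fixed disjoint sets of size $\frac{d}{2}$ for some $s \ge \frac{1}{4} n^\alpha$, and let $N$ be a
uniformly random subset of $[n]$ with exactly $d$ elements. Then with probability at least $1 - e^{-\Omega(n^{\alpha/4})}$,
there exists an index $i$ such that

$$|N \cap N_i| \ge \frac{\alpha \log n}{8 \log\left(\frac{n \log n}{d^2}\right)}.$$

Proof. Let $N'$ be a random subset of $[n]$ obtained by independently taking each element with
probability $\frac{d}{2n}$. The distribution of $N'$ conditioned on the event $|N'| \le d$ can be coupled with
the random variable $N$, so that $N' \subset N$ (given $N'$, let $N$ be a set of size $d$ containing $N'$ chosen
uniformly at random). By Chernoff's bound, the probability of $|N'| > d$ is at most $e^{-\Omega(d)} \le
e^{-\Omega(n^{\alpha/4})}$, since $d \ge n^{\frac{1}{2} - \frac{\alpha}{16}}$ and $\alpha < \frac{1}{2}$. Therefore, in order to prove our lemma, it suffices to show
that with probability at least $1 - e^{-\Omega(n^{\alpha/4})}$, there exists an index $i$ such that $|N' \cap N_i| \ge \frac{\alpha \log n}{8 \log(n \log n / d^2)}$.

Define

$$\gamma = \frac{n \log n}{d^2} \quad \text{and} \quad t = \left\lceil \frac{\alpha \log n}{8 \log \gamma} \right\rceil.$$

Since $n^{\frac{1}{2} - \frac{\alpha}{16}} \le d \le \sqrt{\alpha n \log n}$, we have

$$2 \le \frac{1}{\alpha} \le \gamma \le n^{\alpha/8} \log n,$$

from which it follows that

$$t \ge \frac{\alpha \log n}{8 \log \gamma} \ge \frac{\alpha \log n}{8 \log(n^{\alpha/8} \log n)} = \frac{\alpha \log n}{\alpha \log n + 8 \log \log n} \ge \frac{1}{2}, \qquad (2)$$

for sufficiently large $n$. Therefore, the rounding effect in the definition of $t$ at most doubles the
value, and we have $1 \le t \le \frac{\alpha \log n}{4 \log \gamma}$.

For each index $i$, let $E_i$ be the event that $|N' \cap N_i| \ge t$. As $|N' \cap N_i|$ is binomially distributed,
just as in the proof of Lemma 2.8 we may use the bounds $\binom{n}{k} \ge \left(\frac{n}{k}\right)^k$ and $1 - p \ge e^{-2p}$ (for small
$p$) to find

$$\mathbb{P}[E_i] \ge \binom{d/2}{t} \left(\frac{d}{2n}\right)^t \left(1 - \frac{d}{2n}\right)^{d/2} \ge \left(\frac{d^2}{4nt}\right)^t e^{-\frac{d^2}{2n}}.$$

Substitute $t \le \frac{\alpha \log n}{4 \log \gamma}$ to get

$$\mathbb{P}[E_i] \ge \left(\frac{d^2}{4n} \cdot \frac{4 \log \gamma}{\alpha \log n}\right)^t e^{-\frac{d^2}{2n}} = \left(\frac{\log \gamma}{\alpha \gamma}\right)^t e^{-\frac{\log n}{2\gamma}}.$$

Since $\alpha < \frac{1}{2}$, $\log \gamma \ge \log 2$, and $t \le \frac{\alpha \log n}{4 \log \gamma}$, this is at least

$$\left(\frac{1}{\gamma}\right)^{\frac{\alpha \log n}{4 \log \gamma}} e^{-\frac{\log n}{2\gamma}} = n^{-\frac{\alpha}{4}} n^{-\frac{1}{2\gamma}} \ge n^{-\frac{\alpha}{4}} n^{-\frac{\alpha}{2}} = n^{-\frac{3\alpha}{4}}.$$

The $E_i$ are independent because the $N_i$ are disjoint. Therefore the number of $E_i$ that occur
stochastically dominates a binomial random variable with mean $s n^{-3\alpha/4} \ge \frac{1}{4} n^{\alpha/4}$, and we conclude
by the Chernoff bound that at least one $E_i$ (indeed, several) occurs with probability $1 - e^{-\Omega(n^{\alpha/4})}$,
as desired. □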

In the previous section, in Lemma 2.9, we exploited the fact that the given graph was random
and the edges were independent. This trick is too restrictive to be applied to general graphs.
However, the next lemma says that for star-forests, one can obtain a lemma similar to Lemma 2.9.

Lemma 3.2. Let $\alpha < \frac{1}{2}$ be a fixed positive real number, and suppose that $n$ and $d$ satisfy $n^{\frac{1}{2} - \frac{\alpha}{16}} \le
d \le \sqrt{\alpha n \log n}$, and are sufficiently large. For $i = 1, 2$, let $G_i$ be a $(d, n^\alpha)$-star-forest $\{(v, N_v) : v \in
X_i\}$ in the vertex set $X_i \cup B_i$, where $|X_i| = n^\alpha$ and $|B_i| = n$. The bijection $f_B$ from $B_1$ to $B_2$
chosen uniformly at random satisfies the following property with probability at least $1 - e^{-\Omega(n^{\alpha/4})}$:
$f_B$ can be extended to $X_1 \cup B_1$ so that the graph $f_B(G_1) \cap G_2$ has at least $|X_1| \cdot \frac{\alpha \log n}{36 \log(n \log n / d^2)}$ edges.

Proof. Consider a uniformly random bijection $f_B$ from $B_1$ to $B_2$. As in the proof of Lemma 2.9,
we will pick vertices of $X_1$ one at a time, mapping each one to some vertex in $X_2$ in such a way
that their neighbors intersect in at least $\frac{\alpha \log n}{9 \log(n \log n / d^2)}$ vertices under the map $f_B$. By repeating this
for $|X_1|/4$ steps, we then extend $f_B$ to form a total of at least $\frac{|X_1|}{4} \cdot \frac{\alpha \log n}{9 \log(n \log n / d^2)}$ overlapping edges,
as required.

To this end, suppose that we have already embedded some set $X_1' \subset X_1$ of size less than $|X_1|/4$,
and let $X_2'$ be the image of $X_1'$. Further suppose that we have only exposed the outcome of $f_B$ on
the neighbors of $X_1'$. Let $B_1' = \bigcup_{x \in X_1'} N_x$ and $B_2'$ be its image (which is already fully determined
by our partial exposure). The unexposed remainder of $f_B$, conditioned on the previous outcome,
is a random uniform bijection from $B_1 \setminus B_1'$ to $B_2 \setminus B_2'$. Choose an arbitrary vertex $x_1 \in X_1 \setminus X_1'$.
Call a vertex $x_2 \in X_2 \setminus X_2'$ available if $|N_{x_2} \setminus B_2'| \ge \frac{d}{2}$, or equivalently, $|N_{x_2} \cap B_2'| \le \frac{d}{2}$. Since each
unavailable vertex accounts for at least $\frac{d}{2}$ vertices of $B_2'$, and those sets are disjoint for different
unavailable vertices (because $G_2$ is a star forest), we conclude that the number of unavailable
vertices is at most

$$\frac{|B_2'|}{d/2} = \frac{d|X_2'|}{d/2} = 2|X_1'| < \frac{|X_1|}{2},$$

and hence the number of available vertices in $X_2 \setminus X_2'$ is at least $|X_1|/4$.

We now expose the images of the $d$ neighbors of $x_1$. This is a uniformly random $d$-element
subset of $B_2 \setminus B_2'$, where

$$(1 - o(1))n = n - d|X_1'| \le |B_2 \setminus B_2'| \le n.$$

For each available vertex $x_2$, its (deterministically known) neighborhood in $B_2 \setminus B_2'$ has size at
least $d/2$, and there are at least $|X_1|/4 = n^\alpha/4$ such neighborhoods, all disjoint, coming from
different available vertices. We are therefore in the setting of Lemma 3.1 (with $(1 - o(1))n$ instead
of $n$), and so we conclude that with probability $1 - e^{-\Omega(n^{\alpha/4})}$, there is an available vertex $x_2$
such that $|f_B(N_{x_1}) \cap N_{x_2}| \ge \frac{\alpha \log n}{9 \log(n \log n / d^2)}$. Furthermore, we only need to expose the outcome
of $f_B$ on $N_{x_1}$. We can continue the process for at least $\frac{|X_1|}{4}$ times, with probability at least
$1 - \frac{|X_1|}{4} \cdot e^{-\Omega(n^{\alpha/4})} = 1 - e^{-\Omega(n^{\alpha/4})}$. This proves the lemma. □

Our next proposition bounds the self-similarity of a graph in terms of its median degree. To 
prove the proposition, we will find many star-forests in our graph, and apply Lemma 13.21 several 
times. 

Proposition 3.3. Let a < ^ be a fixed positive real number. Then for every sufficiently large n 
and d satisfying 6n2~Te < d < an log n, every n-vertex graph G with at least ^ vertices of degree 
at least d has l(G) > an 1 ? g " — v 

v ' nrnn l _ ._ f TA log n 1 



2592 log 



Proof. Take a uniformly random partition A\ U A 2 U B\ U B 2 of the vertex set, where \A\\ = \ A 2 \ = 
\Bi\ = \B 2 \ = j. For i = 1,2, let Gi be the bipartite graph formed by the edges between Aj and 
Bi. Since d > n 1 / 3 , by the concentration of the hypergeometric distribution (see, e.g., Theorem 
2.10 of [H]) and a union bound, one can see that a.a.s. each Aj contains at least ^ vertices that 
have at least | neighbors in Bi in the graph Gi. Condition on this event. 

Let d' = A and n' = j, and note that since a < < (n') a < n a . Let k\ be the largest 

integer for which we can find a collection of (d' , (n') a )-star-forests S\j = {(v, N v ) : v £ X\j} in Gi, 

l — a 

where the sets Xij are disjoint subsets of A\, for 1 < j < k\. We claim that k\ > n 18 . Indeed, 
if not, then there exist over ^ — ki{n') a > j| vertices in A\ that are not covered by the sets of 
the form X\j, and have degree at least | in the set B\. Let A\ be the set of these vertices. By 
our maximality assumption, we know that the graph $G_1[A_1' \cup B_1]$ does not contain a $(d', (n')^{\alpha})$-star-forest.
Let $S = \{(v, N_v) : v \in X\}$ be a $(d', h)$-star-forest in $G_1[A_1' \cup B_1]$, where $X \subseteq A_1'$ and
$h$ is as large as possible. By our assumption, we know that $h < (n')^{\alpha}$. Then all the vertices in
A[ \ X have degree at least ^ in the set N = \J veX N v Note that \N\ = d'h < ■ n a and 
\A[ \X\ > ^ - h > In this case, Corollary E21 applied to G[(A[ \ X) U N] already gives 

V ; ~ V LV 1X ' 1J ~ 5\N\-\A[\X\ 500\N\ 950 950' 



,1- 



which for large $n$ is already far more than enough. Therefore, we may assume that $k_1 \ge \frac{n^{1-\alpha}}{18}$.
Similarly, there is a collection of $\frac{n^{1-\alpha}}{18}$ many $(d', (n')^{\alpha})$-star-forests $S_{2j} = \{(v, N_v) : v \in X_{2j}\}$ in
$G_2$, where the sets $X_{2j}$ are disjoint subsets of $A_2$.
Let $f_B$ be a bijection from $B_1$ to $B_2$ chosen uniformly at random. Our initial conditions on $n$ and
$d$ imply that $n'$ and $d'$ satisfy the requirements of Lemma 3.2, so for each fixed $j$, with probability
at least $1 - e^{-\Omega((n')^{\alpha/4})}$, $f_B$ can be extended to a bijection between $B_1 \cup X_{1j}$ and $B_2 \cup X_{2j}$ such that
$f_B(G_1[B_1 \cup X_{1j}])$ and $G_2[B_2 \cup X_{2j}]$ overlap in at least

a log 7/ 2n a alog(^) n Q -alogn 
7ZniZ7ZK\ > -5 /n E -,— \ ^ 



361og(^^)' 3 361og(2*$Es) i441og(^ 

edges, where we used nl ^|" > ^ > 25. 

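Since the sets $X_{1j}$ are disjoint for distinct $j$, and the sets $X_{2j}$ are also disjoint for distinct $j$, a union
bound shows that we can independently extend the bijection $f_B$ by each $X_{1j} \to X_{2j}$ to construct
a map $f : A \to B$ which establishes

$$\iota(G) \ge \frac{n^{1-\alpha}}{18} \cdot \frac{n^{\alpha} \cdot \alpha \log n}{144 \log\left(\frac{n \log n}{d^2}\right)} = \frac{\alpha n \log n}{2592 \log\left(\frac{n \log n}{d^2}\right)},$$

completing the proof. □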
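
The constant in the final bound is the product of the two earlier estimates. As a sanity check, write $L$ as shorthand (not notation from the text) for the logarithm in the denominator, and assume the number of star-forest pairs is $n^{1-\alpha}/18$ and the overlap gained per pair is $n^{\alpha} \alpha \log n / (144 L)$. The arithmetic is then:

```latex
\[
\frac{n^{1-\alpha}}{18} \cdot \frac{n^{\alpha} \cdot \alpha \log n}{144\, L}
  = \frac{\alpha\, n \log n}{(18 \cdot 144)\, L}
  = \frac{\alpha\, n \log n}{2592\, L}.
\]
```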

We are now ready to prove Theorem 1.2 and establish the correct order of magnitude of the
function $\iota(m)$.



Proof of Theorem ] i.ffl Consider the random graph G n ^ p with p = y ^p-- For m = ^n 3 / 2 y / logn, 
we a.a.s. have $e(G_{n,p}) = (1 + o(1))m$, and by Theorem 1.3, $\iota(G_{n,p}) = \Theta(n \log n) = \Theta((m \log m)^{2/3})$.
Since the function $\iota$ is monotone, this shows that $\iota(m) \le O((m \log m)^{2/3})$, and establishes the upper
bound. In the remainder of the proof, we focus on proving the lower bound.
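
The conversion between the two scales in the upper bound can be checked directly. The following sketch assumes only that $m = \Theta(n^{3/2} \sqrt{\log n})$, as in the choice of $m$ above; the constants hidden in the $\Theta$ notation are not tracked.

```latex
\begin{align*}
m = \Theta\bigl(n^{3/2} (\log n)^{1/2}\bigr)
  &\implies \log m = \Theta(\log n), \\
m \log m &= \Theta\bigl(n^{3/2} (\log n)^{3/2}\bigr), \\
(m \log m)^{2/3} &= \Theta(n \log n).
\end{align*}
```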

Let $G$ be the given graph with $n$ vertices and $m$ edges. Without loss of generality, we may
assume that $G$ contains no isolated vertices. Let $n_0 = n$, $m_0 = m$, $G_0 = G$, and let $V_0$ be
the vertex set of $G_0$. Let $n_0 = 2^{a_0} \frac{m_0^{2/3}}{(\log m_0)^{1/3}}$ for some real $a_0$. Let $t = 1$ in the beginning and
consider the following iterative process. At each step $t$, we will either find two large isomorphic
edge-disjoint subgraphs, or will find an induced subgraph $G_t$ on the vertex set $V_t$ such that for
$n_t = |V_t|$, $m_t = |E(G_t)|$, and $a_t$ satisfying $n_t = 2^{a_t} \frac{m_t^{2/3}}{(\log m_t)^{1/3}}$, we have the following properties:



(i) $G_t$ has no isolated vertex,

(ii) m > m t > (l - Zl=l 2~ a <) m > f , and 

(iii) $a_t \le a_{t-1} - \frac{1}{3}$ for $t \ge 1$.



Note that the properties indeed hold for $t = 0$. Suppose that we are given parameters as above
for some $t \ge 0$. If $n_t \ge (m_t \log m_t)^{2/3}$, then by Proposition 2.3 we have
$\iota(G) \ge \Omega((m_t \log m_t)^{2/3}) = \Omega((m \log m)^{2/3})$. On the other hand, if
$n_t \le \frac{8 m_t^{2/3}}{(\log m_t)^{1/3}}$, then by Corollary 2.7 we have $\iota(G) \ge \Omega((m \log m)^{2/3})$.
Therefore, we may assume that

$$\frac{8 m_t^{2/3}}{(\log m_t)^{1/3}} < n_t < (m_t \log m_t)^{2/3}, \tag{3}$$

from which it follows that $3 < a_t < \log_2 \log m_t$. Define

$$d_t = \frac{m_t}{2^{a_t} \cdot n_t} = 2^{-2a_t} (m_t \log m_t)^{1/3},$$

and let $V_t'$ be the subset of vertices which have degree at least $d_t$ in the graph $G_t$. Using the upper
bound of (3) together with $a_t < \log_2 \log m_t$, one can see that



3/2 , , 1/2 

2 at ■ n t (\ogm t y 



for $\alpha = \frac{1}{25}$. The lower bound of (3) gives $m_t < \left(\frac{n_t}{8}\right)^{3/2} (\log m_t)^{1/2}$, so using $a_t > 3$, we find that

$$d_t < \frac{(n_t/8)^{3/2} \sqrt{\log m_t}}{2^{a_t} \cdot n_t} < \sqrt{\alpha n_t \log n_t}.$$

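Consequently, if $|V_t'| \ge \frac{n_t}{2}$, then by Proposition 3.3 we have

$$\iota(G) \ge \frac{\alpha n_t \log n_t}{2592 \log\left(\frac{n_t \log n_t}{d_t^2}\right)} = \frac{n_t \log n_t}{64800 \log\left(\frac{n_t \log n_t}{d_t^2}\right)}.$$

Since $\frac{n_t \log n_t}{d_t^2} = \frac{2^{5a_t} \log n_t}{\log m_t}$ and $\log m_t \ge \log n_t = a_t \log 2 + \frac{2}{3} \log m_t - \frac{1}{3} \log \log m_t \ge \frac{1}{2} \log m_t$, we have

$$\iota(G) \ge \frac{n_t \log n_t}{64800 \log\left(2^{5a_t} \frac{\log n_t}{\log m_t}\right)} \ge \frac{n_t \log m_t}{(648000 \log 2)\, a_t} = \frac{2^{a_t} \cdot m_t^{2/3} (\log m_t)^{2/3}}{(648000 \log 2)\, a_t}.$$

Since $a_t > 3$, we have $\frac{2^{a_t}}{a_t} > 2$, and thus

$$\iota(G) \ge \frac{1}{324000 \log 2}\, m_t^{2/3} (\log m_t)^{2/3} = \Omega((m \log m)^{2/3}).$$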

Otherwise, we have $|V_t'| < \frac{n_t}{2}$. Let $V_{t+1}$ be the set of non-isolated vertices in the induced subgraph
$G[V_t']$. Let $n_{t+1} = |V_{t+1}|$ and let $m_{t+1}$ be the number of edges in the induced subgraph $G_{t+1} = G[V_{t+1}]$.
Define $a_{t+1}$ so that $n_{t+1} = 2^{a_{t+1}} \frac{m_{t+1}^{2/3}}{(\log m_{t+1})^{1/3}}$. Note that since we only removed vertices
whose degree in $G_t$ was less than $d_t$, our new number of edges is $m_{t+1} \ge m_t - n_t d_t = (1 - 2^{-a_t}) m_t$,
and in particular is well above $m_t/2$ because $a_t > 3$. Property (i) follows from the definition. For
Property (ii), note that

$$m_{t+1} \ge (1 - 2^{-a_t}) m_t \ge (1 - 2^{-a_t}) \left(1 - \sum_{i=0}^{t-1} 2^{-a_i}\right) m_0 \ge \left(1 - \sum_{i=0}^{t} 2^{-a_i}\right) m_0,$$
and moreover, since $a_i \ge 3$ and $a_{i+1} \le a_i - \frac{1}{3}$ for all $i$, we have

Finally, since $m_t/2 \le m_{t+1} \le m_t$ we have

$$n_{t+1} \le \frac{n_t}{2} = 2^{a_t - 1} \frac{m_t^{2/3}}{(\log m_t)^{1/3}} \le 2^{a_t - 1} \frac{(2 m_{t+1})^{2/3}}{(\log m_{t+1})^{1/3}} = 2^{a_t - \frac{1}{3}} \frac{m_{t+1}^{2/3}}{(\log m_{t+1})^{1/3}},$$

from which Property (iii) follows. Note that by Property (iii), at some time $s$ we will reach $a_s \le 3$,
and will be done by Corollary 2.7 in the middle of the process at time $s$. □
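
The bookkeeping in the iteration rests on a few identities that follow from the parametrization $n_t = 2^{a_t} m_t^{2/3} / (\log m_t)^{1/3}$. The following is a sketch of that arithmetic. The last line, a bound on the number of rounds, is an inference from Property (iii) and is not stated in the proof itself.

```latex
\begin{align*}
d_t = \frac{m_t}{2^{a_t} n_t}
  &= \frac{m_t (\log m_t)^{1/3}}{2^{2a_t} m_t^{2/3}}
   = 2^{-2a_t} (m_t \log m_t)^{1/3}, \\
n_t d_t &= 2^{-a_t} m_t, \\
2^{a_t - 1} \cdot 2^{2/3} &= 2^{a_t - 1/3}, \\
a_t \le a_0 - \tfrac{t}{3}
  &\implies a_t \le 3 \text{ once } t \ge 3(a_0 - 3).
\end{align*}
```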



4 Concluding remarks 

In this paper, we proved that $\iota(m) = \Theta((m \log m)^{2/3})$. The upper bound followed by considering
the random graph $G_{n,p}$ with $p = \Theta\left(\sqrt{\frac{\log n}{n}}\right)$. For this range of $p$, we have $m = \Theta(n^{3/2} (\log n)^{1/2})$, or
equivalently $n = \Theta\left(\frac{m^{2/3}}{(\log m)^{1/3}}\right)$. By carefully studying the proof of Theorem 1.2, one can notice
that every graph $G$ with $\iota(G) \le O((m \log m)^{2/3})$ has to be somewhat similar to the above random
graph. Indeed, by choosing different parameters in the proof, one can see that for every $\varepsilon > 0$, such
graphs $G$ must contain a subgraph on $n' = \Theta\left(\frac{m^{2/3}}{(\log m)^{1/3}}\right)$ vertices with at least $(1 - \varepsilon)m$ edges, where
the degree of at least $(1 - \varepsilon)n'$ vertices is $\Omega(d)$, for $d$ being the average degree of the subgraph (thus
$d = \Theta((m \log m)^{1/3})$). Moreover, the edges of this subgraph are well-distributed, in the sense that
there does not exist a pair of disjoint vertex subsets X, Y satisfying e(X, Y) 3> \Y\ (since in 

this case we can directly apply Corollary 2.7).
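
The two parameter relations quoted above are consistent with each other. Assuming only that $m = \Theta(n^{3/2} (\log n)^{1/2})$ and that the subgraph has $\Theta(m)$ edges on $n'$ vertices, a short check gives:

```latex
\begin{align*}
n &= \Theta\!\left(\frac{m^{2/3}}{(\log m)^{1/3}}\right)
  \quad \text{since } \log m = \Theta(\log n), \\
d &= \Theta\!\left(\frac{m}{n'}\right)
   = \Theta\!\left(\frac{m (\log m)^{1/3}}{m^{2/3}}\right)
   = \Theta\bigl((m \log m)^{1/3}\bigr).
\end{align*}
```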

For a positive integer $s \ge 2$, let $\iota_s(G)$ be the maximum $t$ for which $G$ contains an $s$-divisible
subgraph with $t$ edges, and let $\iota_{r,s}(m)$ be the minimum of $\iota_s(G)$ over all $r$-uniform hypergraphs
with $m$ edges (thus we have $\iota_r(m) = \iota_{r,2}(m)$). By slightly adjusting our proof of the bound
$\iota(m) = \Theta((m \log m)^{2/3})$, we can also prove for fixed constant $s$ that $\iota_{2,s}(m) = \Theta\left(m^{\frac{s}{2s-1}} (\log m)^{\frac{2s-2}{2s-1}}\right)$.
The upper bound follows by considering the random graph $G_{n,p}$ with $p = \Theta\left(\left(\frac{\log n}{n}\right)^{1/s}\right)$. For the lower
bound, if $n \le \frac{m^{s/(2s-1)}}{(\log m)^{1/(2s-1)}}$, then we can use an argument similar to that of Corollary 2.7, and if
$n \ge m^{\frac{s}{2s-1}} (\log m)^{\frac{2s-2}{2s-1}}$, then we can use an argument similar to that of Proposition 2.3. In the
remaining range of parameters, we can proceed as in Section [3j The value — a „ in Lemma 



O will be replaced by Q ( — r J . 
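
As a consistency check, the exponents for general $s$ should reduce to the graph case when $s = 2$. The following assumes that the general bound has the form $m^{s/(2s-1)} (\log m)^{(2s-2)/(2s-1)}$ and that the lower threshold on $n$ is $m^{s/(2s-1)} / (\log m)^{1/(2s-1)}$; both forms are assumptions about the intended formulas.

```latex
\begin{align*}
s = 2: \quad \frac{s}{2s-1} &= \frac{2}{3}, \qquad
\frac{2s-2}{2s-1} = \frac{2}{3}, \qquad
\frac{1}{2s-1} = \frac{1}{3}, \\
m^{s/(2s-1)} (\log m)^{(2s-2)/(2s-1)} &= (m \log m)^{2/3}, \\
\frac{m^{s/(2s-1)}}{(\log m)^{1/(2s-1)}} &= \frac{m^{2/3}}{(\log m)^{1/3}}.
\end{align*}
```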